1 00:00:41,230 --> 00:00:43,800 On completion of this training sequence, 2 00:00:43,800 --> 00:00:49,510 you will be able to use your voice as a tool to produce high-quality subtitles. 3 00:00:49,510 --> 00:00:53,540 In particular, you are going to learn about how to use it 4 00:00:53,540 --> 00:00:57,430 to support flawless dictation through voice projection, 5 00:00:57,430 --> 00:01:01,690 pacing, articulation and voice modulation. 6 00:01:02,650 --> 00:01:05,350 This is the agenda of this presentation. 7 00:01:11,360 --> 00:01:15,680 Voice projection is strictly related to breathing. 8 00:01:15,680 --> 00:01:17,230 It is the use of voice 9 00:01:17,230 --> 00:01:22,050 in a way that the person to whom you are talking listens to you. 10 00:01:22,050 --> 00:01:25,650 When respeaking, you need to talk to a machine. 11 00:01:25,650 --> 00:01:29,840 In normal talking, we use the upper part of our lungs, 12 00:01:29,840 --> 00:01:35,750 while in respeaking we need to use belly breathing as automatically as possible, 13 00:01:35,750 --> 00:01:39,640 so as to be able to respeak in the long run. 14 00:01:39,640 --> 00:01:40,660 This means 15 00:01:40,660 --> 00:01:45,280 that your voice projection should be regular and well-balanced, 16 00:01:45,280 --> 00:01:49,340 as it is aimed at excellent use of voice commands 17 00:01:49,340 --> 00:01:53,930 in a manner that the machine can consistently listen to your voice 18 00:01:53,930 --> 00:01:55,150 as you trained it. 19 00:01:56,370 --> 00:01:57,530 To do this, 20 00:01:57,530 --> 00:02:02,670 you need to keep practicing belly breathing and speaking at the same time. 21 00:02:02,670 --> 00:02:04,850 When speaking or respeaking, 22 00:02:04,850 --> 00:02:09,340 try to see if you manage to reduce the number of recognition errors 23 00:02:09,340 --> 00:02:12,900 by avoiding ups and downs in the volume.
24 00:02:12,900 --> 00:02:17,190 In particular, pay attention to thinking of all sentences 25 00:02:17,190 --> 00:02:22,140 as if they needed a rising tone; articulating properly; 26 00:02:22,140 --> 00:02:28,880 avoiding suddenly accelerating your dictation, and avoiding ups and downs in your dictation, 27 00:02:28,880 --> 00:02:32,740 but more importantly avoid talking to the machine 28 00:02:32,740 --> 00:02:37,030 the way you talk to humans, as in this demo of mine. 29 00:02:37,030 --> 00:02:41,840 I am not a native speaker of English, so don't focus on the way I pronounce words. 30 00:02:41,840 --> 00:02:46,460 Rather, try to imitate the way I project my voice. 31 00:03:04,350 --> 00:03:08,340 Pacing is the rate at which each of us speaks. 32 00:03:08,340 --> 00:03:12,670 In public speaking, they teach you to vary your speaking speed. 33 00:03:12,670 --> 00:03:14,150 Avoid doing this. 34 00:03:14,150 --> 00:03:16,400 As we have seen in the previous section, 35 00:03:16,400 --> 00:03:20,490 you need to talk to the machine in the most robotic way possible. 36 00:03:20,490 --> 00:03:23,460 This allows for good speech recognition. 37 00:03:23,460 --> 00:03:26,360 Pacing plays an important role in respeaking, 38 00:03:26,360 --> 00:03:30,490 as it is your capacity to speak clearly at a given speech rate. 39 00:03:30,490 --> 00:03:35,700 To understand how quickly you can talk, try and shadow the TV news, 40 00:03:35,700 --> 00:03:39,170 a parliamentary speech or a fast-rate webinar. 41 00:03:39,170 --> 00:03:45,110 In respeaking, pacing has to be adapted to the speaker’s rate constantly. 42 00:03:50,980 --> 00:03:54,280 Articulation is strictly related to pacing 43 00:03:54,280 --> 00:03:59,390 as you should not only speak quickly to keep the same pace as the speaker.
44 00:03:59,390 --> 00:04:02,760 You should also speak in a clear and distinctive manner, 45 00:04:02,760 --> 00:04:06,950 so that the machine can transcribe the words you have in mind 46 00:04:06,950 --> 00:04:09,820 and not words that sound similar. 47 00:04:09,820 --> 00:04:14,740 To do this, keep an eye on the articulation rules in your language. 48 00:04:14,740 --> 00:04:20,350 This is extremely important to reduce recognition errors while respeaking. 49 00:04:20,350 --> 00:04:25,560 To do so, it is sometimes necessary to change the way you pronounce a word 50 00:04:25,560 --> 00:04:30,510 through macros or other peculiar ways of articulating words, 51 00:04:30,510 --> 00:04:33,520 as per Element number 4. 52 00:04:33,520 --> 00:04:39,820 Always keep track of special articulation so as to make the most out of it. 53 00:04:40,550 --> 00:04:43,480 As a rule of thumb, valid for many languages, 54 00:04:43,480 --> 00:04:47,110 remember to always articulate all syllables in a word 55 00:04:47,110 --> 00:04:50,610 and all words in a sentence the way they should be, 56 00:04:50,610 --> 00:04:57,080 and not as you do in your daily life, if this changes the way they are pronounced. 57 00:04:57,080 --> 00:05:01,930 With pluricentric languages like English, this also means you need to check 58 00:05:01,930 --> 00:05:08,100 if the automatic speech recognition software that you use can be adjusted to your accent. 59 00:05:08,100 --> 00:05:11,570 In this case, be consistent with this standard 60 00:05:11,570 --> 00:05:17,210 and do not change the way you articulate depending on the accent of the speaker. 61 00:05:17,210 --> 00:05:21,500 Also, avoid stressing syllables too much, 62 00:05:21,500 --> 00:05:25,920 as a machine is not a human being.
63 00:05:25,920 --> 00:05:29,150 In case of possibly ambiguous words, 64 00:05:29,150 --> 00:05:33,310 pronounce the word without stressing a given syllable too much 65 00:05:33,310 --> 00:05:38,960 as in many languages vowel length can make the difference between two words, 66 00:05:38,960 --> 00:05:41,990 as in “shot” and “short”. 67 00:05:49,680 --> 00:05:53,770 Finally, voice modulation is the use you make of your voice 68 00:05:53,770 --> 00:05:56,580 in terms of volume and tone. 69 00:05:56,580 --> 00:06:02,290 In human conversation you adapt it to the message or effect you want to convey. 70 00:06:02,290 --> 00:06:06,410 In respeaking, this means being consistent with the three aspects 71 00:06:06,410 --> 00:06:09,840 we have dealt with in the previous sections 72 00:06:09,840 --> 00:06:13,640 and keeping modulation as robotic as possible. 73 00:06:13,640 --> 00:06:18,260 This may seem easy, but it can be a challenge with emotional speakers, 74 00:06:18,260 --> 00:06:20,800 as you may tend to imitate them. 75 00:06:20,800 --> 00:06:24,790 If you realise the number of recognition errors increases, 76 00:06:24,790 --> 00:06:28,160 incorrect voice modulation may be the reason. 77 00:06:34,560 --> 00:06:37,630 In this video lecture, we have seen how important it is 78 00:06:37,630 --> 00:06:43,440 to make good use of your voice while breathing, so as to support dictation. 79 00:06:43,440 --> 00:06:46,800 In particular, we have seen how to command voice projection, 80 00:06:46,800 --> 00:06:53,070 pacing, articulation and voice modulation to avoid as many recognition errors as possible 81 00:06:53,070 --> 00:06:56,570 while trying to subtitle challenging speakers.